System and method for OFDM symbol receiving and processing

ABSTRACT

A method for receiving and processing orthogonal frequency division multiplexing (OFDM) signals, the method may include receiving a stream of OFDM symbols; searching, by a timing circuit, in the stream of OFDM symbols, for a training sequence that comprises a first Golay codeword and a second Golay codeword; processing, by a timing circuit, the training sequence and extracting timing information about a timing of reception of OFDM symbols, out of the stream of OFDM symbols, that convey data; wherein a sum of an autocorrelation of the first Golay codeword and an autocorrelation of the second Golay codeword consists essentially of a delta function.

This application claims priority from U.S. provisional patent Ser. No. 62/020,467 filing date Jul. 3, 2014, which is incorporated herein by reference.

BACKGROUND

The following articles provide a brief description of the prior art:

-   [1] T. M. Schmidl and D. C. Cox, “Robust frequency and timing synchronization for OFDM,” IEEE Transactions on Communications, vol. 45, no. 12, pp. 1613-1621, 1997.
-   [2] T. M. Schmidl and D. C. Cox, “Low-overhead, low-complexity [burst] synchronization for OFDM,” Proceedings of ICC/SUPERCOMM '96 - International Conference on Communications, vol. 3, pp. 1301-1306.
-   [3] A. Bar, A. Tolmachev, M. Nazarathy, and S. Member, “In-Service Monitoring of Chromatic Dispersion With Filter-Bank Digitally Sub-Banded OFDM,” vol. 25, no. 22, pp. 2189-2192, 2013.
-   [4] R. Goldman, A. Agmon, M. Nazarathy, and S. Member, “Direct Detection and Coherent Optical Time-Domain Reflectometry With Golay Complementary Codes,” Journal of Lightwave Technology, vol. 31, no. 13, pp. 2207-2222, 2013.
-   [5] M. Nazarathy and A. Tolmachev, “Sub Banded DSP Architectures Based on Underdecimated Filter Banks for Coherent OFDM Receivers,” IEEE Signal processing magazine, no. February 2014, pp. 70-81.
-   [6] M. Nazarathy and A. Tolmachev, “Filter-bank based digital sub-banding ASIC architecture for coherent optical receivers,” vol. 8647, p. 86470J, January 2013.
-   [7] H. Minn, V. K. Bhargava, and K. B. Letaief, “A robust timing and frequency synchronization for OFDM systems,” IEEE Transactions on Wireless Communications, vol. 24, no. 5, pp. 822-839, May 2003.

There is a growing need to improve the carrier and frequency recovery of OFDM signals.

SUMMARY

There are provided an OFDM receiver, and a method for receiving and processing OFDM symbols according to various embodiments of the invention.

There may be provided an orthogonal frequency division multiplexing (OFDM) receiver that may include: an input port that may be configured to receive a stream of OFDM symbols; a timing circuit that may be configured to search, in the stream of OFDM symbols, for a training sequence that may include a first Golay codeword and a second Golay codeword and to process the training sequence and extract timing information about a timing of reception of OFDM symbols, out of the stream of OFDM symbols, that convey data; wherein the sum of an autocorrelation of the first Golay codeword and an autocorrelation of the second Golay codeword consists essentially of a delta function.

The first Golay codeword and the second Golay codeword may be separated from each other by at least one padding bit.

The first Golay codeword and the second Golay codeword may not be separated from each other by any padding bits.

At least ninety percent of energy of the sum may belong to the delta function.

The sum may consist only of the delta function.

The delta function may have a peak that equals twice a length of the first Golay codeword.

The timing circuit may not include multiplication units.

The OFDM symbol stream may include multiple interleaved sequences of oversampled data symbols; wherein each sequence of oversampled data symbols may include a training sequence candidate; wherein the timing circuit may be configured to select a selected training sequence out of multiple training sequence candidates of the OFDM sequence stream.

The timing circuit may be configured to calculate cross-correlation peaks by cross correlating between each of the multiple training sequence candidates and a reference training sequence that may include the first Golay codeword and the second Golay codeword.

The timing circuit may be configured to select the selected training sequence in response to the cross-correlation peaks.

The timing circuit may be configured to select as the selected training sequence a selected training sequence candidate having a biggest cross correlation peak out of the cross correlation peaks.

The timing circuit may be configured to define a timing reference point as a location of the cross correlation peak of the selected training sequence.

The timing circuit may be configured to compare the cross correlation peak of the selected training sequence to a cross correlation peak of at least one training sequence candidate that differs from the selected training sequence to provide a comparison result; and to determine a fractional timing offset based upon the comparison.

The OFDM receiver may include a frequency offset determination circuit that may be configured to determine a frequency offset of the OFDM sequence in response to a value of the cross correlation peak of the selected training sequence.

The frequency offset determination circuit may be configured to: divide the first Golay codeword of the selected training sequence into multiple first subsets; calculate first averages of cross correlations between the multiple first subsets and corresponding reference first Golay codeword subsets; divide the second Golay codeword of the selected training sequence into multiple second subsets; calculate second averages of cross correlations between the multiple second subsets and corresponding reference second Golay codeword subsets; extract phase differences between first averages and corresponding averages; and determine the frequency offset of the OFDM sequence in response to the phase differences.

The circuit may include: a first cross-correlation circuit that may include first taps and may be configured to search for the first Golay codeword; a second cross-correlation circuit that may include second taps and may be configured to search for the second Golay codeword; wherein the frequency offset determination circuit may include multiple phase detectors that are configured to calculate phase differences between output signals of different taps of the first taps and the second taps; wherein the frequency offset determination circuit may be configured to determine the frequency offset of the OFDM sequence in response to the phase differences.

The multiple phase detectors comprise first phase detectors that are configured to calculate phase differences between output signals of different first taps; second phase detectors that are configured to calculate phase differences between output signals of different second taps.

The multiple phase detectors comprise a phase detector that may be configured to calculate a phase difference between a first output signal of a first tap and a second output signal of a second tap.

A method for receiving and processing orthogonal frequency division multiplexing (OFDM) signals, the method may include: receiving a stream of OFDM symbols; searching, by a timing circuit, in the stream of OFDM symbols, for a training sequence that may include a first Golay codeword and a second Golay codeword; processing, by a timing circuit, the training sequence and extracting timing information about a timing of reception of OFDM symbols, out of the stream of OFDM symbols, that convey data; wherein a sum of an autocorrelation of the first Golay codeword and an autocorrelation of the second Golay codeword consists essentially of a delta function.

BRIEF DESCRIPTION OF THE DRAWINGS

The subject matter regarded as the invention is particularly pointed out and distinctly claimed in the concluding portion of the specification. The invention, however, both as to organization and method of operation, together with objects, features, and advantages thereof, may best be understood by reference to the following detailed description when read with the accompanying drawings in which:

FIG. 1 illustrates an autocorrelation of a first Golay codeword, an autocorrelation of a second Golay codeword and a sum of the autocorrelations according to an embodiment of the invention;

FIG. 2 illustrates various formats of a first Golay codeword according to an embodiment of the invention;

FIG. 3 illustrates autocorrelations according to an embodiment of the invention;

FIG. 4 illustrates an OFDM transmitter according to an embodiment of the invention;

FIG. 5 illustrates an OFDM receiver according to an embodiment of the invention;

FIG. 6 illustrates digital data samples and analog data samples as well as sampling points according to an embodiment of the invention;

FIG. 7 illustrates intensities of mainlobes and sidelobes according to an embodiment of the invention;

FIG. 8 illustrates a cross correlation that is applied on data according to an embodiment of the invention;

FIG. 9 illustrates phase errors applied on a training sequence according to an embodiment of the invention;

FIG. 10 illustrates a timing circuit according to an embodiment of the invention;

FIG. 11 illustrates a frequency offset determination circuit according to an embodiment of the invention;

FIG. 12 illustrates simulation results according to an embodiment of the invention;

FIG. 13 illustrates simulation results according to an embodiment of the invention;

FIG. 14 illustrates simulation results according to an embodiment of the invention;

FIG. 15 illustrates simulation results according to an embodiment of the invention;

FIG. 16 illustrates simulation results according to an embodiment of the invention;

FIG. 17 illustrates simulation results according to an embodiment of the invention;

FIG. 18 illustrates simulation results according to an embodiment of the invention;

FIG. 19 illustrates phase detectors according to an embodiment of the invention;

FIG. 20 illustrates a frequency offset determination circuit according to an embodiment of the invention;

FIG. 21 illustrates a frequency offset determination circuit according to an embodiment of the invention;

FIG. 22 illustrates a frequency offset determination circuit according to an embodiment of the invention;

FIG. 23 illustrates a frequency offset determination circuit according to an embodiment of the invention;

FIG. 24 illustrates a frequency offset determination circuit according to an embodiment of the invention;

FIG. 25 illustrates a frequency offset determination circuit according to an embodiment of the invention;

FIG. 26 illustrates a frequency offset determination circuit according to an embodiment of the invention;

FIG. 27 illustrates a frequency offset determination circuit according to an embodiment of the invention;

FIG. 28 illustrates a frequency offset determination circuit according to an embodiment of the invention;

FIG. 29 illustrates a frequency offset determination circuit according to an embodiment of the invention;

FIG. 30 illustrates a frequency offset determination circuit according to an embodiment of the invention;

FIG. 31 illustrates a frequency offset determination circuit according to an embodiment of the invention;

FIG. 32 illustrates a frequency offset determination circuit according to an embodiment of the invention;

FIG. 33 illustrates a method according to an embodiment of the invention;

FIG. 34 illustrates a component of a receiver according to an embodiment of the invention; and

FIGS. 35-39 illustrate simulation results according to various embodiments of the invention.

DETAILED DESCRIPTION OF THE DRAWINGS

In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be understood by those skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, and components have not been described in detail so as not to obscure the present invention.

The subject matter regarded as the invention is particularly pointed out and distinctly claimed in the concluding portion of the specification. The invention, however, both as to organization and method of operation, together with objects, features, and advantages thereof, may best be understood by reference to the following detailed description when read with the accompanying drawings.

It will be appreciated that for simplicity and clarity of illustration, elements shown in the figures have not necessarily been drawn to scale. For example, the dimensions of some of the elements may be exaggerated relative to other elements for clarity. Further, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements.

Because the illustrated embodiments of the present invention may for the most part, be implemented using electronic components and circuits known to those skilled in the art, details will not be explained in any greater extent than that considered necessary as illustrated above, for the understanding and appreciation of the underlying concepts of the present invention and in order not to obfuscate or distract from the teachings of the present invention.

Any reference in the specification to a method should be applied mutatis mutandis to a system capable of executing the method.

Any reference in the specification to a system should be applied mutatis mutandis to a method that may be executed by the system.

There is provided a system and a method for an improved Coarse Timing Offset (CTO) and Carrier Frequency Offset (CFO) recovery algorithm for OFDM and its variants, based on the so-called Golay Complementary Codes (GCC). The scheme is suitable for filter-bank based (optionally DFT-Spread) OFDM. The signal processing is entirely multiplier-free, and the autocorrelation features a distinct single-sample peak well above the sidelobes even in the presence of multiple strong channel impairments.

Although OFDM-based optical transmission has not yet been commercially deployed, intense research continues in this promising direction as the OFDM approach continually evolves and improved multiple variants are introduced, such as DFT-spread (DFT-S) OFDM, filter-bank based sub-banded OFDM [SPM, Hauske] and combinations thereof.

In all OFDM variants, training sequences are used for coarse timing offset (CTO) recovery as well as carrier frequency offset (CFO) recovery, i.e. estimation and correction of the timing window to perform the receiver FFT for OFDM detection, as well as the estimation of the CFO and its subsequent cancellation (note: “coarse” in the CTO term means up to one integer sample, as any fractional delay is left to the OFDM one-tap equalizer to mitigate). The transmitted signal consists of (optionally DFT-S) OFDM data symbols interspersed with training symbols launched at low repetition rate. The received signal is subjected to Delay&Correlate (D&C) (moving window cross-correlation) processing, continually searching for the presence of an auto-correlation (ACOR) peak of the transmitted training symbol. Ideally, sharp ACOR peaks should be generated even in the presence of noise and impairments. Inserting a sequence with good ACOR properties within identical sections of the training symbol enables both D&C processing for CTO estimation, as well as related Delay&Phase-compare (D&PC) processing for CFO estimation. The pioneering work in this area [1], [2] yielded the Schmidl-Cox [S-C] algorithm based on transmitting the training symbol (A, A), with A some pseudorandom sequence, and applying D&C to the two halves.

Improved algorithms such as Minn's followed suit in the wireless area. In our previous works on digitally sub-banded OFDM systems, as recently reviewed in a tutorial [SPM], we have ported the Minn algorithm from wireless communication to optical transmission for the purposes of CTO and CFO recovery sub-band by sub-band. We have also applied the Minn scheme for Chromatic Dispersion (CD) estimation [3]. Our Minn estimator is based on a training sequence of the form (A, A, −A, −A), wherein A is a Zadoff-Chu CAZAC-type finite sequence with “good” ACOR (narrow peak with no plateau, and high mainlobe-to-sidelobes rejection ratio (MSRR)). There already exists extensive optical communication literature on using CAZAC sequences for CD and Polarization (2×2 MIMO channel) estimation purposes, rather than for CTO and CFO estimation as advocated in our previous Minn-based work. However, usage of Golay sequences has been suggested as an alternative to using CAZAC sequences. In the framework of optical communication, GCC have already been demonstrated, albeit in the frequency domain, for Polarization (POL) 2×2 MIMO channel estimation as well as CD estimation. There have been other instances of the useful tool of Golay sequences in the field of optical sensing.

There is provided an improved CTO and CFO recovery algorithm for OFDM and its variants, based on the so-called Golay Complementary Codes (GCC), which are pairs of sequences yielding a discrete delta-sequence upon summing up their a-periodic autocorrelations. The novel proposed GCC-based scheme will be shown to outperform the state-of-the-art Minn-based scheme for CTO and CFO estimation purposes as well as provide even lower complexity than D&C techniques—since Golay cross-correlation is multiplier-free, just performing additions, as Golay sequences are ±1-valued.

Our concept has been inspired by our work on fiber sensing, specifically Optical Time Domain Reflectometry (OTDR) [4], wherein the fiber is sequentially probed by pairs of Golay sequences and the optical backscatter “echo” is cross-correlated with the two Golay codewords in turn, separated in time by a guard interval, followed by suitable digital signal processing (DSP) consisting of adding up the cross-correlations of the received signals with the two codewords. In contrast to the frequency-domain Golay schemes mentioned above, our multiplier-free GCC estimator is based on time-domain processing, in which the multiplications by ±1 reduce to signed additions. This is far less complex than frequency-domain Golay processing (which requires complex multipliers with arbitrary values, as the Golay spectra are pseudo-random).

Extra advantages of the proposed GCC estimator, beyond being multiplier-free and sharp-peaked (high MSRR and narrow mainlobe, ideally a single distinct peak), are as follows. When used in conjunction with filter-bank based DFT-S OFDM, which is our main interest, the GCC estimator is highly tolerant of CD and PMD impairments but is less tolerant of CFO relative to the Minn scheme (although its CFO tolerance is sufficient for sub-band processing). Our scheme features a ~3 dB OSNR advantage with respect to ASE-induced white noise while displaying about the same tolerance with respect to laser phase noise. Our scheme supports twice oversampling (as required in under-decimated filter-bank based digital sub-banding) without incurring increased computational complexity due to the processing of interpolated values. It further features decoupled variable frame sizes operation in both its CTO and CFO estimation modes, for flexible overhead.

I. Golay Complementary Codes Review

Given two complex-valued sequences, both assumed infinite, A={A_(k)}_(k=−∞) ^(∞), B={B_(k)}_(k=−∞) ^(∞), i.e., defined over the domain of integers, R, then their (a-periodic) cross-correlation (XCOR) is the following sequence:

$\begin{matrix} \begin{matrix} {\Gamma_{AB} \equiv {A\; \bullet \; B}} \\ {\equiv {A \otimes B^{\dagger}}} \\ {= {\sum\limits_{k^{\prime} = {- \infty}}^{\infty}{A_{k^{\prime}}B_{k^{\prime} - k}^{*}}}} \\ {= {\sum\limits_{k^{\prime} = {- \infty}}^{\infty}{A_{k^{\prime} + k}B_{k^{\prime}}^{*}}}} \end{matrix} & (1) \end{matrix}$

where $\otimes$ denotes convolution and $\bullet$ denotes XCOR. In particular, the autocorrelation (ACOR) of a sequence A is:

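$\begin{matrix} {\Gamma_{A} \equiv \Gamma_{AA} = {A \bullet A} = {\sum\limits_{k^{\prime} = {- \infty}}^{\infty}{A_{k^{\prime} + k}A_{k^{\prime}}^{*}}}} & (2) \end{matrix}$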

If the two sequences are finite, say, each containing L points, e.g., with support {0, 1, . . . , L−1}, then they are assumed zero-padded to become infinite. Then the support of their XCOR is {−(L−1), . . . , −1, 0, 1, . . . , L−1}, containing 2L−1 points.
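The definition in (1) and the 2L−1 point support can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `xcor` and `xcor_direct` and the two test sequences are illustrative only and are not part of the described system.

```python
import numpy as np

def xcor(a, b):
    """A-periodic XCOR per (1): Gamma[k] = sum_k' a[k'+k] * conj(b[k']).

    Returns (lags, values) with lags running from -(len(b)-1) to len(a)-1.
    """
    a = np.asarray(a, dtype=complex)
    b = np.asarray(b, dtype=complex)
    values = np.correlate(a, b, mode="full")  # numpy conjugates the second argument
    lags = np.arange(-(len(b) - 1), len(a))
    return lags, values

def xcor_direct(a, b):
    """Same quantity evaluated literally from the defining sum (zero-padded sequences)."""
    a = np.asarray(a, dtype=complex)
    b = np.asarray(b, dtype=complex)
    lags = np.arange(-(len(b) - 1), len(a))
    out = np.zeros(len(lags), dtype=complex)
    for i, k in enumerate(lags):
        for kp in range(len(b)):
            if 0 <= kp + k < len(a):
                out[i] += a[kp + k] * np.conj(b[kp])
    return lags, out

L = 4
A = np.array([1, 1j, -1, 1])      # arbitrary illustrative sequences
B = np.array([1, -1, 1j, 1])
lags, gamma_ab = xcor(A, B)
_, gamma_direct = xcor_direct(A, B)
print(len(lags), np.allclose(gamma_ab, gamma_direct))
```

The library call and the explicit double loop agree, and the lag axis spans {−(L−1), . . . , L−1}.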

Evident properties of the XCOR (and ACOR) are that the XCOR is associative, distributive but not commutative:

BA ^(†)=(AB)^(†) =A ^(†) *B  (3)

where the conjugate-reflection or para-conjugation operation on a sequence is defined as follows:

$\begin{matrix} {{s^{\dagger}[k]} \equiv {s^{*}[-k]}} & (4) \end{matrix}$

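The delay property of the XCOR reads (with $D^{k_{0}}s[k] \equiv s[k - k_{0}]$ the delay operator):

$\begin{matrix} {{\left( D^{k_{A}}A \right) \bullet \left( D^{k_{B}}B \right)} = {D^{k_{A} - k_{B}}\left( {A \bullet B} \right)}} & (5) \end{matrix}$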

In particular

(D ^(k) ^(A) A)B=D ^(k) ^(A) (AB)≡D ^(k) ^(A) AB

(D ^(k) ⁰ A)(D ^(k) ⁰ B)=AB  (6)

Complementary sequences (CS) are pairs of sequences with the useful property that their out-of-phase aperiodic autocorrelation coefficients sum to zero.

A complementary pair a, b may be encoded as polynomials A(z)=a(0)+a(1)z+ . . . +a(N−1)z^(N−1) and similarly for B(z). The complementarity property of the sequences is equivalent to the condition |A(z)|²+|B(z)|²=2N for all z on the unit circle, that is, |z|=1. If so, A and B form a Golay pair of polynomials. Examples include the Shapiro polynomials, which give rise to complementary sequences of length a power of 2.
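The unit-circle condition can be checked by sampling A(z) and B(z) with a zero-padded FFT. This is a minimal sketch assuming NumPy; the length-4 pair used here is one known complementary pair chosen for illustration, and the number of evaluation points is arbitrary.

```python
import numpy as np

# A length-4 complementary pair (illustrative choice).
a = np.array([1, 1, 1, -1])
b = np.array([1, 1, -1, 1])
N = len(a)

# The zero-padded FFT evaluates each polynomial at n_points equally spaced
# points on the unit circle |z| = 1.
n_points = 64
A = np.fft.fft(a, n=n_points)
B = np.fft.fft(b, n=n_points)

power_sum = np.abs(A) ** 2 + np.abs(B) ** 2
print(np.allclose(power_sum, 2 * N))
```

Each individual spectrum fluctuates with frequency, but the two power spectra add up to the constant 2N, which is the frequency-domain counterpart of the sidelobe cancellation in the time domain.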

A Golay Complementary Code (GCC) is defined as a pair (G_(L) ⁽¹⁾, G_(L) ⁽²⁾) of sequences of length L (the Golay codewords) with unimodular elements,

$\left| {G_{L}^{(i)}[k]} \right| = 1,\quad i = 1,2;\quad k = 0,1,\ldots,L - 1$

satisfying the following complementary ACOR property:

G _(L) ⁽¹⁾ [k]G _(L) ⁽¹⁾ [k]+G _(L) ⁽²⁾ [k]G _(L) ⁽²⁾ [k]=2Lδ[k]   (7)

with δ[k] the discrete-time impulse. Note that while the ACOR of each GCC codeword may have non-zero sidelobes, once the two ACORs are summed up, their sidelobes perfectly cancel out, while the peak doubles up.

FIG. 1 illustrates an autocorrelation 10 of the first Golay codeword, an autocorrelation 20 of the second Golay codeword and a sum 30 of these autocorrelations. Autocorrelation 10 includes peak (mainlobe) 11 and sidelobes 12. Autocorrelation 20 includes peak (mainlobe) 21 and sidelobes 22. Sidelobes 22 cancel sidelobes 12 when autocorrelations 10 and 20 are added to each other to provide peak 31 and zero sidelobes 32. It is noted that the peak may have a value of 2L. It is noted that the sum may have some non-zero sidelobes and that the peak may be of a value that differs from 2L.

For a power-of-two length L, a ±1-valued GCC may be recursively constructed by concatenating half-length Golay pairs as follows:

$\begin{matrix} {G_{L}^{(1)} \equiv \left[ {G_{L/2}^{(1)},G_{L/2}^{(2)}} \right];\quad G_{L}^{(2)} = \left[ {G_{L/2}^{(1)}, - G_{L/2}^{(2)}} \right]} & (8) \end{matrix}$

initialized as G₁ ⁽¹⁾=[1] and G₁ ⁽²⁾=[1].
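The recursion and its initialization can be sketched in a few lines and checked against the complementary ACOR property (7). The sketch assumes NumPy; the function names `golay_pair` and `acor` are illustrative, and L=16 is used only as an example length.

```python
import numpy as np

def golay_pair(L):
    """Build a +/-1 valued Golay pair of power-of-two length L by repeated concatenation."""
    if L < 1 or (L & (L - 1)) != 0:
        raise ValueError("L must be a power of two")
    g1 = np.array([1])
    g2 = np.array([1])
    while len(g1) < L:
        # New first codeword is [g1, g2]; new second codeword is [g1, -g2].
        g1, g2 = np.concatenate([g1, g2]), np.concatenate([g1, -g2])
    return g1, g2

def acor(x):
    """A-periodic autocorrelation; zero lag sits at index len(x)-1."""
    return np.correlate(x, x, mode="full")

L = 16
g1, g2 = golay_pair(L)
acor_sum = acor(g1) + acor(g2)

expected = np.zeros(2 * L - 1, dtype=int)
expected[L - 1] = 2 * L            # 2L * delta[k]
print(np.array_equal(acor_sum, expected))
```

Each codeword on its own has non-zero sidelobes; only the sum of the two autocorrelations collapses to the single sample of height 2L.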

If (G⁽¹⁾, G⁽²⁾) is a GCC, then so are (G⁽²⁾, G⁽¹⁾), (±G⁽¹⁾, ±G⁽²⁾), (G^((1)†), G^((2)†)) and (D^(k₀)G⁽¹⁾, D^(k₀)G⁽²⁾).

II. Proposed GCC-Based Timing Offset Estimator

The suggested GCC-based timing and CFO estimator, illustrated below is referred to as Golay Estimator (G-EST). The transmitter (Tx) repeatedly (at low duty cycle) launches a training sequence (TS) of length N_(TS), of the form:

$\begin{matrix} {g \equiv \left\{ {g[k]} \right\}_{0}^{N_{TS} - 1} = {D^{L_{edg}}G_{L}^{(1)}} + {D^{L_{edg} + L + L_{c}}G_{L}^{(2)}}} & (9) \end{matrix}$

where

$\begin{matrix} {{2L_{edg}} + L_{c} + {2L} = N_{TS}} & (10) \end{matrix}$

Thus, the proposed TS 40 (FIG. 2) consists of the two codewords (51, 52) of the GCC pair embedded in three zero-padding intervals (41, 42 and 43) of respective lengths L_(edg), L_(c), L_(edg), used as guardbands to prevent or reduce overlaps of the various ACOR and XCOR terms (including the XCOR terms with the neighboring DFT-S OFDM data symbols).

The TS is positioned between data frames 61.

Note that we formally include the two null edge segments of length L_(edg) each, within the definition of the 4L-points (pnt) support of g, although the elements of these segments are null.

In the preferred implementations, the lengths of the TS as well as that of each GCC codeword are powers-of-two. Thus, to maximize the duty cycle of the GCC codewords within the overall TS, the following length constraints must be satisfied,

$\begin{matrix} {N_{TS} = 4L,\quad {2L_{edg}} + L_{c} = 2L} & (11) \end{matrix}$

Implying that the TS has 50% duty cycle. It will be shown that the overall G-EST performance is somewhat sensitive to the “guardbands ratio” r_(GB) ≡L_(c)/L_(edg) determining the partition between the two guardband types; in the sequel we optimize over this ratio.
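The TS layout of (9) under the constraints (11) can be sketched as follows. This assumes NumPy; `golay_pair` and `build_ts` are illustrative helper names, and the split L_edg=8, L_c=16 for L=16 is an arbitrary example that satisfies (11), not an optimized guardbands ratio.

```python
import numpy as np

def golay_pair(L):
    """+/-1 Golay pair of power-of-two length L, built by repeated concatenation."""
    g1, g2 = np.array([1]), np.array([1])
    while len(g1) < L:
        g1, g2 = np.concatenate([g1, g2]), np.concatenate([g1, -g2])
    return g1, g2

def build_ts(g1, g2, l_edg, l_c):
    """Training sequence: [edge zeros | G1 | middle zeros | G2 | edge zeros]."""
    edge = np.zeros(l_edg, dtype=int)
    mid = np.zeros(l_c, dtype=int)
    return np.concatenate([edge, g1, mid, g2, edge])

L = 16
l_edg, l_c = 8, 16                  # hypothetical split with 2*l_edg + l_c == 2*L
g1, g2 = golay_pair(L)
g = build_ts(g1, g2, l_edg, l_c)

duty_cycle = np.count_nonzero(g) / len(g)
guardband_ratio = l_c / l_edg       # r_GB as defined in the text
print(len(g), duty_cycle, guardband_ratio)
```

Changing `l_edg` and `l_c` while keeping 2L_edg+L_c=2L moves guard samples between the edges and the middle without changing the TS length or its duty cycle.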

FIG. 2 also illustrates a training sequence 40″ that includes only first and second Golay codewords 51 and 52.

FIG. 2 also illustrates a training sequence 40′ that includes only the edge zero sequences but does not include any zeros between first and second Golay codewords 51 and 52.

We initially analyze the case of ideal noise-free and distortion-free transmission, without oversampling (the Rx samples at baudrate). Singling out a particular TS, we assume a lone TS has been transmitted, preceded and followed by data frames. Initially let us ignore channel impairments, both distortions and noise (assume the sampled linear impulse response is impulsive and assume no nonlinear distortion and no noise). Therefore the received sequence r[k] coincides with the transmitted sequence. Let us then express the received signal into the G-EST module as a juxtaposition of three components, namely {data, TS, data}:

$\begin{matrix} {r = \left\{ {r[k]} \right\}_{k = {- \infty}}^{\infty} = d_{-}^{\dagger} + g + {D^{4L}d_{+}}} & (12) \end{matrix}$

where d₋, g, d₊ are zero-padded sequences, extending over all R, and we expressed the data subsequences d₋^(†), D^(4L)d₊ (data respectively preceding and following the TS) in terms of underlying causal sequences d₋, d₊. We recall that the support of g has duration N_(TS)=4L (see (11)), ranging over {0, 1, . . . , 4L−1}.

The G-EST module cross-correlates the received signal against the transmitted TS, generating the following statistic:

ρ[k]=r[k]g[k]=r[k]{circle around (x)}g ^(†) [k]=r[k]{circle around (x)}g[−k]  (13)

Very Low-Complexity Implementation

This XCOR operation, referred to as the TS cross-correlator (TS-XCOR), may be simply realized in real time by means of an FIR filter with impulse response g[−k] (FIG. 7). Since the elements of the TS g[k] are {0, ±1}, this FIR filter is trivial to implement, as the tap multiplications reduce to signed additions, 2L of them per sample (the taps corresponding to 0 are eliminated). Remarkably, the cross-correlator for the proposed GCC-based timing recovery scheme is multiplier-free (unlike alternative D&C based timing recovery schemes, which require one fast multiplier at the line rate to perform the correlations). A size 4L memory (buffer) is further required (e.g., for L=16 in our exemplary system).
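The reduction of the tap multiplications to signed additions can be illustrated in software. The sketch below assumes NumPy and is a behavioral model only, not a hardware description; `ts_xcor_signed_additions`, the guard lengths and the random test input are hypothetical.

```python
import numpy as np

def golay_pair(L):
    """+/-1 Golay pair of power-of-two length L, built by repeated concatenation."""
    g1, g2 = np.array([1]), np.array([1])
    while len(g1) < L:
        g1, g2 = np.concatenate([g1, g2]), np.concatenate([g1, -g2])
    return g1, g2

def build_ts(g1, g2, l_edg, l_c):
    edge, mid = np.zeros(l_edg, dtype=int), np.zeros(l_c, dtype=int)
    return np.concatenate([edge, g1, mid, g2, edge])

def ts_xcor_signed_additions(r, g):
    """Sliding TS-XCOR using only additions and subtractions.

    Taps equal to +1 add the sample, taps equal to -1 subtract it,
    and taps equal to 0 are skipped, leaving 2L signed additions per output.
    """
    plus = np.flatnonzero(g > 0)
    minus = np.flatnonzero(g < 0)
    n_out = len(r) - len(g) + 1
    rho = np.zeros(n_out, dtype=complex)
    for n in range(n_out):
        window = r[n:n + len(g)]          # the 4L-sample buffer
        rho[n] = window[plus].sum() - window[minus].sum()
    return rho

L = 16
g = build_ts(*golay_pair(L), l_edg=8, l_c=16)   # illustrative guard lengths
rng = np.random.default_rng(0)
r = rng.standard_normal(200) + 1j * rng.standard_normal(200)

rho_add = ts_xcor_signed_additions(r, g)
rho_ref = np.correlate(r, g.astype(complex), mode="valid")  # multiply-and-add reference
print(np.allclose(rho_add, rho_ref))
```

The signed-addition form and the generic multiply-and-add correlation give the same output, so the multiplier-free structure loses nothing in accuracy.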

The signal ρ[k] generated by the FIR filter (the TS-XCOR output) is split to feed the timing and CFO detector sub-modules. The timing detector consists of a peak position extractor, finding the peaks of the absolute value (squared) of the cross-correlation of the streaming received signal and the Golay Training sequence (G-TS).

The output ρ[k] will be shown to consist of distinct single-sample peaks, corresponding to the G-TS locations, embedded in some low-level sidelobes. Assuming single shot TS transmission there is a single peak, but as the G-TS is periodically repeated, say every several hundred frames, there will be repeated peaks indicative of the G-TS positions.

Timing Detector

The timing detector may be robustly realized by applying binary threshold decisions to the absolute values |ρ[k]| of the TS-XCOR output samples (or alternatively to the absolute values squared, which may be easier to evaluate). The discrete-time instants when the absolute value of the correlator output exceeds the threshold are declared as timing estimates, indicative of the positions of the TS embedded within the data stream. Other peak finding algorithms are possible.
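A threshold-based timing detector of this kind can be sketched as follows, for the noise-free, impulsive-channel case. The sketch assumes NumPy; the QPSK data model, the frame lengths, L=32, the guard lengths and the threshold of 0.75·2L are all hypothetical values chosen for illustration.

```python
import numpy as np

def golay_pair(L):
    """+/-1 Golay pair of power-of-two length L, built by repeated concatenation."""
    g1, g2 = np.array([1]), np.array([1])
    while len(g1) < L:
        g1, g2 = np.concatenate([g1, g2]), np.concatenate([g1, -g2])
    return g1, g2

def build_ts(g1, g2, l_edg, l_c):
    edge, mid = np.zeros(l_edg, dtype=int), np.zeros(l_c, dtype=int)
    return np.concatenate([edge, g1, mid, g2, edge])

def detect_ts_positions(r, g, threshold):
    """Return indices where |rho|^2 reaches threshold^2, plus the metric itself."""
    rho = np.correlate(r, g.astype(complex), mode="valid")
    metric = rho.real ** 2 + rho.imag ** 2      # absolute value squared
    return np.flatnonzero(metric >= threshold ** 2), metric

L = 32
g = build_ts(*golay_pair(L), l_edg=16, l_c=32)   # illustrative guard lengths

rng = np.random.default_rng(1)
def qpsk(n):
    """Unit-modulus QPSK symbols standing in for the data frames."""
    return (rng.choice([-1, 1], n) + 1j * rng.choice([-1, 1], n)) / np.sqrt(2)

d_minus, d_plus = qpsk(300), qpsk(300)
r = np.concatenate([d_minus, g, d_plus])         # {data, TS, data}
ts_start = len(d_minus)

positions, metric = detect_ts_positions(r, g, threshold=0.75 * 2 * L)
print(positions, ts_start)
```

With the correlator window aligned to the TS, the output equals 2L exactly, while misaligned windows only accumulate incoherent data and cross-term contributions, so a fixed threshold between the two levels separates them.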

This completes the description of the GCC based timing estimator, the signal analysis of which is carried out next.

Signal Analysis

Using (12), the TS-XCOR output is expressed as:

ρ=rg=d ⁻ ^(†) g+gg+D ^(4L) d ₊ g  (14)

The presence of guardbands now implies that the supports of the three terms in the RHS of (14) are just partially or not at all overlapping, provided the autocorrelation lag (argument) is not taken to have excessive value. To begin with, let us evaluate the main term g•g by expressing the GCC TS as per (9) and using (5) and (3), yielding for the TS ACOR term (with L_(mid) denoting the length L_(c) of the middle guard interval in (9)):

$\begin{matrix} {{g\; \bullet \; g} = {{G_{L}^{(1)}\bullet \; G_{L}^{(1)}} + {G_{L}^{(2)}\bullet \; G_{L}^{(2)}} + {D^{- {({L_{mid} + L})}}G_{L}^{(1)}\bullet \; G_{L}^{(2)}} + {D^{L_{mid} + L}\left( {G_{L}^{(1)}\bullet \; G_{L}^{(2)}} \right)}^{\dagger}}} & (15) \end{matrix}$

Thus, the Rx additively superposes the sum of autocorrelations with the sum of the cross-correlations. It is also useful to visualize this result graphically (FIG. 1), inspecting (14) and imagining that the TS is slid over itself at varying lags, while having the inner product (multiply and add) generated. At small lags, we first have each Golay codeword overlap with itself (with the corresponding codeword in the received data), while generating the sum of the two autocorrelations; then eventually we have one correlating Golay codeword overlap with the other, while the other correlating codeword starts overlapping with the data.

Now, by virtue of the complementary ACORs property (7), the last equation yields a key result for the TS ACOR:

gg=2Lδ[k]+D ^(−(L) ^(mid) ^(+L)) G _(L) ⁽¹⁾ G _(L) ⁽²⁾ +D ^(L) ^(mid) ^(+L)(G _(L) ⁽¹⁾ G _(L) ⁽²⁾)^(†)  (16)

The useful term, used to extract the timing, is evidently the impulsive peak 2Lδ[k], whereas the XCOR terms yield a background of sidelobes, which will be shown to be at relatively small levels in comparison with the peak. The supports of G_(L) ⁽¹⁾•G_(L) ⁽²⁾ and its reflection are symmetric around the origin: {−(L−1), . . . , −1, 0, 1, . . . , L−1}. The centers of the two cross-terms in the last expression are offset by ±(L_(mid)+L) from the origin, so these cross-terms have respective supports symmetrically positioned around the origin: {L_(c)−1, L_(c), . . . , L_(c)+2L−1} and {−(L_(c)+2L−1), . . . , −L_(c), −(L_(c)−1)}.

Thus, the gap separating the two cross-term supports equals

$\begin{matrix} {\left[ {L_{c} - 1} \right] - \left[ {- \left( {L_{c} - 1} \right)} \right] - 1 = {2L_{c}} - 3} & (17) \end{matrix}$

It is within this gap that the peak 2Lδ[k] is embedded (for L_(c)<2 there is no gap at all, but even for L_(c)=0, 1 the peak typically dominates the value of the sum of autocorrelation terms at zero lag). Having evaluated g•g and shown that it is essentially impulsive, we must also consider the propagation of the two additional DATA×TS terms in (14) via the TS-XCOR. It turns out that the response due to these terms at the cross-correlator output may have partial or no overlap with the XCOR terms of g•g, as may be seen by explicitly evaluating these DATA×TS terms:

$\begin{matrix} \begin{matrix} {{d_{-}^{\dagger} \bullet g} = {d_{-}^{\dagger} \bullet \left( {{D^{L_{edg}}G_{L}^{(1)}} + {D^{L_{edg} + L + L_{c}}G_{L}^{(2)}}} \right)}} \\ {= {{D^{- L_{edg}}d_{-}^{\dagger} \bullet G_{L}^{(1)}} + {D^{- {({L_{edg} + L + L_{c}})}}d_{-}^{\dagger} \bullet G_{L}^{(2)}}}} \end{matrix} & (18) \end{matrix}$

and

$\begin{matrix} \begin{matrix} {{D^{4L}d_{+} \bullet g} = {D^{4L}d_{+} \bullet \left( {{D^{L_{edg}}G_{L}^{(1)}} + {D^{L_{edg} + L + L_{c}}G_{L}^{(2)}}} \right)}} \\ {= {{D^{{4L} - L_{edg}}d_{+} \bullet G_{L}^{(1)}} + {D^{{3L} - L_{edg} - L_{c}}d_{+} \bullet G_{L}^{(2)}}}} \end{matrix} & (19) \end{matrix}$

Simulations 201 and 203 of FIG. 3 will reveal that the impulsive peak dominates over both these terms, as well as over the cross-terms previously evaluated in (15). As the data is assumed white, its XCOR with each of the Golay sequences yields incoherent buildup to relatively low values; moreover, the XCOR of the two Golay codes also appears pseudo-random, hence the ±1 terms to be added in its evaluation yield a small value, as the number of pluses and minuses is roughly balanced. Thus, the TS XCOR output yields a distinct peak, surrounded by very low values (due to the various XCOR terms between the two sequences and among the two sequences and the data, as well as due to noise). The peak is generated when the received TS is aligned with the TS copy stored in the cross-correlator taps. When the two TS are unsynchronized by even one sample, the output drops significantly, enabling good discrimination of the peak.
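The dominance of the zero-lag peak of the TS ACOR over its cross-term sidelobes can be illustrated numerically. This is a sketch assuming NumPy, with illustrative helper names and an arbitrary guard split for L=16; it evaluates only the g•g term, without data or noise, and does not reproduce the simulations of FIG. 3.

```python
import numpy as np

def golay_pair(L):
    """+/-1 Golay pair of power-of-two length L, built by repeated concatenation."""
    g1, g2 = np.array([1]), np.array([1])
    while len(g1) < L:
        g1, g2 = np.concatenate([g1, g2]), np.concatenate([g1, -g2])
    return g1, g2

def build_ts(g1, g2, l_edg, l_c):
    edge, mid = np.zeros(l_edg, dtype=int), np.zeros(l_c, dtype=int)
    return np.concatenate([edge, g1, mid, g2, edge])

L = 16
g = build_ts(*golay_pair(L), l_edg=8, l_c=16)    # illustrative guard lengths

acor_g = np.correlate(g, g, mode="full")         # g XCOR g over all lags
zero_lag = len(g) - 1                            # index of lag 0
peak = acor_g[zero_lag]
max_sidelobe = np.max(np.abs(np.delete(acor_g, zero_lag)))

# Mainlobe-to-sidelobes rejection ratio of the TS ACOR alone, in dB.
msrr_db = 20 * np.log10(peak / max_sidelobe)
print(peak, max_sidelobe, msrr_db)
```

The peak equals 2L because the two codeword autocorrelations add coherently at zero lag, whereas each remaining sidelobe comes from a single cross-correlation between the two codewords and therefore cannot exceed L in magnitude.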

Effect of Channel Memory (Delay Spread)

Heretofore, we have assumed that the discrete-time impulse response h[k] of the linear optical channel is impulsive, h[k]∝δ[k], where ∝ denotes proportionality. This “digital” impulse response is related to the analog impulse response h_(a)(t) by h[k]=h_(a)(kT_(s)), where T_(s) is the sampling interval.

The analog impulse response does not have to be an impulse δ(t) in order for the digital impulse response to be a discrete-time impulse; rather, the support of h_(a)(t) must satisfy support{h_(a)(t)}<T_(s), i.e., the analog delay spread must be less than a sample interval. This indicates that the proposed method works best when this condition is satisfied, but the condition need not be strictly satisfied, as discussed next. Let us now assume that support{h_(a)(t)}≧T_(s), i.e. the delay spread exceeds the sampling interval. Now, h[k] contains several non-zero samples. A model taking into account non-impulsive h[k] is readily formulated. The received signal is r[k]=s[k]⊗h[k], where s[k] is the overall transmitted signal (TS embedded in data). Eq. (12) is now replaced by S={s[k]}_(k=−∞) ^(∞)=d⁻ ^(†)+g+D^(4L)d₊, thus the received discrete-time sequence is given by

r=h⊗d⁻ ^(†)+h⊗g+h⊗D^(4L)d₊   (20)

The output of the TS-XCOR is then given by

$\begin{matrix} {\rho = {{r \cdot g} = {{r \otimes g^{\dagger}} = {{\left( {{h \otimes d_{-}^{\dagger}} + {h \otimes g} + {{h \otimes D^{4L}}d_{+}}} \right) \otimes g^{\dagger}} = {{{h \otimes d_{-}^{\dagger} \otimes g^{\dagger}} + {h \otimes g \otimes g^{\dagger}} + {{h \otimes D^{4L}}{d_{+} \otimes g^{\dagger}}}} = {{h \otimes \left( {d_{-}^{\dagger} \cdot g} \right)} + {h \otimes \left( {g \cdot g} \right)} + {h \otimes \left( {D^{4L}{d_{+} \cdot g}} \right)}}}}}}} & (21) \end{matrix}$

Now the cross-terms are convolved with the channel impulse response (which does not appreciably modify them as these terms are essentially pseudorandom) but more significantly, the useful term, containing timing information within (21) is now (using (16)) given by:

$\begin{matrix} {{h \otimes \left( {g \cdot g} \right)} = {{2{Lh}} + {D^{- {({L_{mid} + L})}}{h \otimes \left( {G_{L}^{(1)} \cdot G_{L}^{(2)}} \right)}} + {D^{L_{mid} + L}{h \otimes \left( {G_{L}^{(1)} \cdot G_{L}^{(2)}} \right)^{\dagger}}}}} & (22) \end{matrix}$

Within this term the dominant sub-term is 2Lh, which replaces the 2Lδ term in the memoryless channel case. This indicates that when the discrete-time channel has memory, the Golay timing estimation method actually performs channel impulse response identification. This may still result in a timing estimate; e.g., in case h[k] peaks at k=0, finding the maximum still provides timing information. Notwithstanding this analysis, which indicates that the proposed method may still work for channels with memory, here we are primarily interested in filter-bank based receivers, wherein the proposed timing recovery method is applied on a sub-band basis. In this case the sampling rate of each relatively narrowband sub-band is substantially lower than the sampling rate of the overall channel, hence the condition support{h_(a)(t)}<T_(s) typically holds, making timing recovery more robust for a sub-band based receiver than for a full-channel receiver, which may reveal multiple densely taken samples of the non-impulsive impulse response h[k].
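The dominant 2Lh sub-term can be checked numerically with a small sketch. As a simplifying assumption, each codeword is correlated separately against its own channel-filtered copy, so the G⁽¹⁾×G⁽²⁾ cross-terms of (22) do not appear; the three-tap channel `h` and the helper names are hypothetical.

```python
def golay_pair(length):
    """Build a complementary Golay pair of the given power-of-two length."""
    a, b = [1], [-1]
    while len(a) < length:
        a, b = a + b, a + [-x for x in b]
    return a, b

def convolve(x, h):
    """Linear convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def xcor_at(r, c, lag):
    """Cross-correlation sum_i r[i + lag] * c[i], with r taken as zero outside its range."""
    return sum(r[i + lag] * c[i] for i in range(len(c)) if 0 <= i + lag < len(r))

L = 8
g1, g2 = golay_pair(L)
h = [1.0, 0.5, 0.25]                 # hypothetical channel with memory

r1, r2 = convolve(g1, h), convolve(g2, h)
# Sum of the two per-codeword correlations for lags -2..4
rho = [xcor_at(r1, g1, k) + xcor_at(r2, g2, k) for k in range(-2, 5)]
print(rho)                           # 2L * h[k] at lags 0..2, zero elsewhere
```

The complementary sidelobes cancel at every lag, so the correlator output reproduces the channel taps scaled by 2L, which is the impulse response identification noted above.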

It is further evident that the proposed Golay based timing recovery is also suitable for channels which exhibit discrete-multipath (channel impulse response consists of a superposition of multiple impulses with various delays and amplitudes).

III. Twice-Oversampled Golay Timing Recovery

Heretofore the analysis was conducted for a baud-rate sampled Rx. Consider now a K-fold fractionally oversampled Rx, in the simplest case a twice-oversampled (2×OS) Rx. A simple conceptual block diagram for the transmission chain comprises an interpolator (a K-fold up-sampler followed by a shaping filter) and an ideal DAC at the Tx; an analog channel (or its discrete-time equivalent), represented as a linear time-invariant analog filter with impulse response h_(a)(t); and, at the Rx, an ideal ADC followed by a decimator (a K-fold down-sampler preceded by an anti-aliasing filter). The DAC and ADC operate at the elevated sampling rate of K times the symbol rate.

Thus, the cascade DAC->analog_channel_filter->ADC amounts to a discrete-time channel h_(c)[n]=h_(a)(nT_(c)), where T_(c) is the sampling interval within the channel. Here f_(s)=T_(s) ⁻¹ is the sampling rate at the Tx input and Rx output; the sampling rate within the channel is K times larger,

f _(c) =T _(c) ⁻¹=(T _(s) /K)⁻¹ =Kf _(s)

Now suppose that the sampled impulse response satisfies:

h[k]=h _(a)(kT _(s))=h _(a)(kKT _(c))=h _(c) [kK]∝δ[k]   (23)

A sufficient condition for this is that the analog impulse response satisfy support {h_(a)(t)}<T_(s), however this is not necessary. Indeed, if the zero-crossings of h_(a)(t) occur at regular intervals, {kT_(s)}_(k≠0) then (23) is satisfied even when the support of h_(a)(t) is arbitrarily large. In digital terms it is apparent that K-fold sub-sampling h_(c)[n] should yield a response h[k]≡h_(c)[kK] which is essentially a discrete impulse. Under these conditions the K-fold up-sampler and K-fold down-sampler are essentially back-to-back and cancel out, thus the overall transmission chain becomes an identity. Thus, the transmitted baud-rate symbols get reconstructed at the receiver. To the extent that the Nyquist condition (23) is not strictly satisfied then there will be Inter-Symbol Interference (ISI). However if the taps of the impulse response h[k] are not precisely zero for k≠0 but are close to zero, then the ISI is small and in the current context it may modify the complementary ACOR sidelobes somewhat but hardly affect the distinct mainlobe peak which may still remain dominant.
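As an illustration of condition (23), the sketch below assumes a hypothetical sinc-shaped channel response whose zero-crossings fall on multiples of T_(s); its support is much longer than T_(s), yet K-fold sub-sampling still yields a discrete impulse. The pulse shape and the tap range are assumptions chosen only for the example.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

K = 2                                            # oversampling factor
# Channel-rate taps h_c[n] = h_a(n T_c) with h_a(t) = sinc(t / T_s), T_s = K T_c
h_c = {n: sinc(n / K) for n in range(-8, 9)}
# K-fold sub-sampling: h[k] = h_c[kK], for k = -4..4
h = [h_c[k * K] for k in range(-4, 5)]

print([round(v, 12) for v in h])                 # a single unit tap at k = 0
```

The odd-indexed taps of `h_c` are non-zero (for example h_c[1] = 2/π), so the channel-rate response is not impulsive even though the baud-rate response is.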

Our proposed Golay timing method was tested to operate using a receiver which oversamples the data by the factor of two, as the filter bank receiver does within each sub-band, provided suitable measures are taken as described below. We also propose an extension for an oversampling receiver which operates with any integer oversampling ratio by simply cross-correlating each of the polyphases of the received K-fold oversampled signal and selecting the XCOR with largest mainlobe. Another case when there may be oversampling used, hence the proposed Golay based timing method is applicable, is for fractionally-equalized single-carrier receiver using a receiver sampling rate which is an integer multiple of that of the transmitted symbol rate.

For definiteness we describe how any twice-oversampled receiver may be equipped with a Golay based timing estimator. FIGS. 4 and 5 describe the case of twice-oversampled filter-bank transmission [Nazarathy, SPM'14], but the structure of the module “Golay-CTO&FTO Estimator” is generically applicable to any twice-oversampled receiver, e.g. a single-carrier receiver with a sampling rate equal to twice the baud rate, provided the support of the sampled channel impulse response is a single sample. The Tx in FIG. 4 corresponds for definiteness to a DFT-S OFDM transmitter [Shieh, COIN'10]. The receiver front-end corresponds to one path of a twice-oversampled filter-bank. The transmitter receives, by a serial to parallel module 250, a sub-band data stream that is fed to an FFT (such as a 64-point FFT 260) that is followed by a much larger IFFT (such as a 1024-point IFFT 270) that is followed by a root cosine shaping filter 280 that is followed by channel 190. The IFFT outputs OFDM symbols that are up-sampled by a factor of sixteen. The signals outputted by IFFT 102 are referred to in FIG. 6 as digital data samples 401 that are double sampled (sampling points 402), and the output signals of the root cosine shaping filter 280 are referred to in FIG. 6 as analog data samples 404. The analog filter widens the digital data samples so that both odd and even symbols may have a nonzero value.

FIG. 5 illustrates an OFDM receiver that includes an input port for receiving the transmitted symbols from channel 290, a root raised cosine shaping filter 310, a filter 320 of a filterbank that passes symbols of frequencies that correspond to the sub-band, a down-sampler (for example a 1:8 downsampler 330) a Golay-CTO&FTO Estimator 301 and other receiver sub-band circuits 380.

Golay-CTO&FTO Estimator 301 is configured to separate the twice-oversampled stream of OFDM symbols into even and odd OFDM symbols (that form the even and odd polyphase subsequences), by means of a 1:2 serial-to-parallel (S/P) module (a two-state commutator) 250. The even and odd polyphase sequences are input into respective Golay-Training-Sequence Cross-correlator (G-TS-XCOR) modules 100. The resulting cross-correlation streams are input into the coarse timing estimator 360, which selects the time index of the largest value, both over time and between the two XCOR sequences. Thus, the highest XCOR mainlobe (peak) of the two is selected (typically the XCOR mainlobes of the even and odd sequences occur at the same time index, or one index off), and the index corresponding to the largest of the two XCOR peaks provides the CTO estimate. We conclude that the CTO timing is determined by the stronger G-TS-XCOR peak in the two even and odd data sub-streams.

Curves 421, 422, 423 and 424 of FIG. 7 plot the mainlobes and maximal sidelobes of the two G-TS-XCOR modules as a function of the fractional delay in the channel.

FIG. 7 indicates that the relative sizes of the mainlobe peaks are determined by the Fractional Timing Offset (FTO) in the sub-band sub-channel (group delay per sub-band modulo the sampling interval, thus the FTO is fractional delay, less than one sample interval). The relative sizes of the two mainlobes can be used to extract an FTO estimate (to be used in setting up linear phase vs. frequency slope in the 2×2 MIMO equalizer) also useful in counteracting Sampling Frequency Offset (SFO), i.e. a discrepancy between the sampling clocks in the transmitter and receiver, which shows up as slowly increasing or decreasing FTO.

To describe the internals of the fractional timing offset estimator 370, one may develop an analytic expression or numerical graph for the difference between the two XCOR peaks of the even and odd polyphases as a function of the FTO and this functional dependence may be inverted to obtain an FTO estimate as a function of the difference in absolute values or absolute values squared of the two mainlobe peaks. The FTO estimate may then be used to set up the linear phase vs. frequency index slope in the 2×2 MIMO equalizer of the sub-band receiver.

However, it typically suffices to use the difference between the absolute squares of the two peaks, which is an odd function of the FTO, as the error signal feeding a PLL-like loop that controls the ADC sampling clocks, thereby effectively stabilizing the FTO.

The timing estimator described in the sequel is then proposed for the following generic oversampled system with under-decimation as depicted in FIG. 5. The under-decimated digital equivalent chain consists of a K-fold upsampler, an effective channel filter h_(c)[n] and a down-sampler.

The proposed GCC-based timing estimator consists of a 1:K serial-to-parallel (S/P) module, followed on each of its parallel arms by a TS-XCOR module, performing the correlation with the GCC-based TS at the slowed down sampling rate (at baud-rate). The outputs of all cross-correlators are compared and the path with the highest absolute value of cross-correlation is selected (first the peak over discrete time is selected for each path, then the highest absolute value is selected among all polyphases (S/P outputs)). Thus, this method may be described as per-polyphase-cross-correlation.
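The per-polyphase cross-correlation just described can be sketched as follows. The sketch assumes a noiseless toy signal in which the odd polyphase carries the TS at full amplitude and the even polyphase at half amplitude (a crude stand-in for a fractional delay); the function names and numeric values are illustrative assumptions.

```python
def golay_pair(length):
    """Build a complementary Golay pair of the given power-of-two length."""
    a, b = [1], [-1]
    while len(a) < length:
        a, b = a + b, a + [-x for x in b]
    return a, b

def polyphases(samples, K):
    """1:K serial-to-parallel conversion: polyphase p holds samples p, p+K, p+2K, ..."""
    return [samples[p::K] for p in range(K)]

def xcor(seq, template):
    """Baud-rate cross-correlation of one polyphase with the stored TS."""
    n = len(template)
    return [sum(seq[lag + i] * template[i] for i in range(n))
            for lag in range(len(seq) - n + 1)]

def per_polyphase_timing(samples, template, K):
    """Return (peak magnitude, polyphase index, baud-rate lag) of the strongest path."""
    best = None
    for p, seq in enumerate(polyphases(samples, K)):
        values = xcor(seq, template)
        lag = max(range(len(values)), key=lambda i: abs(values[i]))  # peak over time
        candidate = (abs(values[lag]), p, lag)
        if best is None or candidate[0] > best[0]:                   # peak over polyphases
            best = candidate
    return best

L, L_c, K = 8, 4, 2
g1, g2 = golay_pair(L)
ts = g1 + [0] * L_c + g2
baud = [0] * 10 + ts + [0] * 10
# Toy 2x oversampled stream: even samples at half amplitude, odd samples at full amplitude
samples = [v for x in baud for v in (0.5 * x, x)]

peak, polyphase, lag = per_polyphase_timing(samples, ts, K)
print(peak, polyphase, lag)
```

Selecting first over time and then over polyphases mirrors the two-stage comparison in the text; for K=2 the two paths are the even and odd sub-sequences.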

In the special case that K=2 (twice-under-decimated system) the 1:2 S/P effectively extracts the even and odd sub-sequences of the K-down-sampled output, and baud-rate TS-XCORs are performed on each of these two sub-sequences. In this case, it is possible to estimate the fractional delay of the channel from the relative levels of the mainlobes of the even and odd TS-XCORs. As shown by the simulation of FIG. 7, as the fractional delay τ is scanned from zero to one full interval T_(c), the levels of the even and odd TS-XCORs vary in complementary fashion. These two levels become balanced (equal) when τ=T_(c)/4, i.e. when the normalized fractional delay τ̂≡τ/T_(c) attains the value 0.25; when τ̂=0, it is the even TS-XCOR that peaks, whereas when τ̂=0.5, it is the odd TS-XCOR that peaks. An estimator for the fractional delay may be set up by taking the difference of the two absolute values (or absolute values squared) of the two TS-XCOR mainlobes. This difference is null for τ̂=0.25, whereas it becomes positive with maximal absolute value for τ̂=0.5 and negative with maximal absolute value for τ̂=0. If a means is provided to adjust the fractional delay (e.g. an interpolation filter, or a frequency-domain equalization system) then it is possible to feed the estimated fractional delay into that system. In particular, a frequency-domain based adaptive equalizer may be initialized with the phase step corresponding to the estimated fractional delay, in order to provide faster convergence of the adaptive algorithm.
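A minimal sketch of such a discriminator is given below. It takes the difference of the absolute squares of the odd and even mainlobes; dividing by their sum is an added assumption (not stated in the text) that makes the output independent of the received power. Mapping the output to an actual delay would require the curve of FIG. 7, which is not reproduced here.

```python
def fto_discriminator(even_peak, odd_peak):
    """Odd-minus-even difference of absolute squares, normalized to [-1, 1].

    Zero when the two mainlobes are balanced, positive when the odd
    polyphase dominates, negative when the even polyphase dominates.
    """
    e = abs(even_peak) ** 2
    o = abs(odd_peak) ** 2
    if e + o == 0:
        raise ValueError("both mainlobes are zero")
    return (o - e) / (o + e)

print(fto_discriminator(1 + 1j, 1 - 1j))   # balanced mainlobes
print(fto_discriminator(2.0, 0.0))         # only the even TS-XCOR peaks
print(fto_discriminator(0.0, 2.0))         # only the odd TS-XCOR peaks
```

The sign convention follows the text: the output is null at the balance point and takes opposite signs on either side, so it may serve as the error signal of a tracking loop.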

Simulation Results for Golay Timing of Twice-Oversampled Rx

Graphs 203, 204, 205, 206, 205′, 206′ of FIGS. 12-14 present simulations of the XCOR outputs for sub-banded optical coherent transmission over 1200 km of standard single mode fiber (SSMF) with dispersion parameter β₂=−21·10⁻²⁴ [sec²/km] accounting for the chromatic dispersion of a single polarization. Additional parameters are a Carrier Frequency Offset of 5 MHz, L_(c)=12, 13 dB OSNR, and 100 kHz laser linewidth. 64 OFDM sub-carriers were used per sub-band, occupying the sub-band bandwidth of

$\frac{25}{15}$

[GHz]. For our novel training sequence 32 tones out of the 64 were used as two Golay sequences as described before, whereas the rest of the tones were zeros as per Eq. (2).

These simulations were conducted for the twice-oversampled receiver of FIG. 5, selecting for presentation, between the XCORs of the even and odd polyphases the one with higher mainlobe.

The three figures differ in their usage of linear or dB scale and in the horizontal range. It is apparent that the Minn algorithm timing peak is not very sharp, and one can imagine that values slightly off-peak may be affected by noise spikes such that they are mistaken for the peak. Another notable effect is that the Golay sidelobes are depressed in the local environment of the peak, facilitating peak discrimination.

FIG. 15 shows an accumulation of all values of the mainlobes and sidelobes of the XCORs of the even and odd polyphases for repeated transmissions of the G-TS (as a function of the TS index). It is apparent that in this particular case it is the XCOR odd polyphase that has a higher mainlobe to sidelobe ratio (and a higher mainlobe, relative to the XCOR of the even polyphase).

IV. CFO Tolerance

In this sub-section we consider the tolerance of the proposed timing estimator to the presence of Carrier Frequency Offset (CFO). In this case a simple analytical model may be set up, indicating that CFO reduces the level of the correlation mainlobe (peak) and accurately predicting the dependence of the peak level on Δv_(CFO). It turns out that the CFO modifies the sidelobe levels only slightly, as borne out by simulations, hence the CFO tolerance is essentially determined by the roll-off of the peak level with increasing Δv_(CFO). First we present the result of CFO tolerance simulations (FIG. 16, graph 212). It turns out that the dependence of the peak value on Δv_(CFO) is given by a “dinc×cos” functional profile, where the “digital sinc” (dinc) function is the Dirichlet kernel,

$\begin{matrix} {{{Dinc}_{D}\lbrack u\rbrack} \equiv \frac{\sin \left( {\pi \; u} \right)}{D\; {\sin \left( {\pi \; {u/D}} \right)}}} & (24) \end{matrix}$

The precise mainlobe expression is:

$\begin{matrix} {\left( {2L} \right)^{2}{{{\cos \left( {\theta_{CFO}\frac{1}{2}\left( {\frac{N}{4} + L_{C}} \right)} \right)} \cdot {{Dinc}_{\frac{N}{4}}\left( \theta_{CFO} \right)}}}^{2}} & (25) \end{matrix}$

To understand this result intuitively, consider first what occurs at the moment of alignment of the transmitted and received G-TS sequences (FIG. 8), assuming no CFO. As the Golay codewords are bipolar, ±1, then at the instant of alignment, the ±1 are multiplied by ±1, respectively and added up, yielding the L+L=2 L values of unity, i.e. a mainlobe of 2 L. Now, if there is CFO, the analytical model yielding the product “dinc×cos” of the two functions may be derived, as indicated in FIG. 8. Now, the outputs of the FIR tap multipliers 132 and 131 in the Golay XCOR structure of FIG. 10 are no longer all ones, but due to the CFO (linear phase ramp on the received data), the outputs of the FIR multipliers are

$^{{j\theta}_{CFO}},^{{j2}\; \theta_{CFO}},\ldots \mspace{14mu},{^{j\frac{N}{4}\theta_{CFO}}\left( {{{with}\mspace{14mu} \theta_{CFO}} = {2{\pi\Delta}\; v_{CFO}T_{s}}} \right)}$

for the first Golay codeword and

$^{{j\theta}_{CFO}{({\frac{N}{4} + L_{C}})}}\left( {1,^{{j\theta}_{CFO}},^{{j2\theta}_{CFO}},\ldots \mspace{14mu},{^{j\frac{N}{4}}\theta_{CFO}}} \right)$

for the second Golay codeword. Summing all the complex phase factors and taking the absolute value squared yields the result of (25).
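The summation of phase factors can be checked numerically against the “dinc×cos” profile. The sketch below makes two assumptions that the text leaves implicit: the codeword length is L=N/4 with sample indices k=0..L−1 for the first codeword and k=L+L_c..2L+L_c−1 for the second, and the dinc argument is normalized as u=Lθ_CFO/(2π). The function names and the values of L, L_c and θ_CFO are illustrative.

```python
import cmath
import math

def dinc(u, D):
    """Dirichlet kernel sin(pi u) / (D sin(pi u / D)), equal to 1 in the limit u -> 0."""
    den = D * math.sin(math.pi * u / D)
    return 1.0 if abs(den) < 1e-15 else math.sin(math.pi * u) / den

def mainlobe_direct(L, L_c, theta):
    """Sum the tap-multiplier outputs of both codewords and take the absolute square."""
    first = sum(cmath.exp(1j * theta * k) for k in range(L))
    second = sum(cmath.exp(1j * theta * (k + L + L_c)) for k in range(L))
    return abs(first + second) ** 2

def mainlobe_closed(L, L_c, theta):
    """Closed form (2L)^2 |cos(theta (L + L_c) / 2) * dinc_L(L theta / 2 pi)|^2."""
    u = L * theta / (2 * math.pi)
    return (2 * L) ** 2 * (math.cos(theta * (L + L_c) / 2) * dinc(u, L)) ** 2

L, L_c = 16, 12
for theta in (0.0, 0.01, 0.05):
    print(theta, mainlobe_direct(L, L_c, theta), mainlobe_closed(L, L_c, theta))
```

Under these assumptions the two evaluations agree, and the cosine factor shows directly why a smaller gap L_c gives a milder roll-off of the mainlobe with CFO.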

One operational conclusion from this expression for the mainlobe power degradation as a function of the CFO is that it is worth decreasing the gap L_(c) between the two Golay codes in order to make the cosine function roll off more mildly (graphs 214 and 215 of FIG. 17). The tradeoff is that the cross-correlation term between the two Golay codewords now starts overlapping with portions of the previously null regions, but overall we may be better off in mainlobe-to-sidelobes ratio by taking a smaller gap L_(c), such that the roll-off of the mainlobe is milder, although the sidelobes may be somewhat increased.

V. Tolerance to Additive White Noise (Amplified Spontaneous Emission) and to Laser Phase Noise.

AWGN (Additive White Gaussian Noise)

D&C methods are based on a cross-correlation between parts of the received frame. Each multiplication between the samples at times k and k−L (here L=N/4),

$\left( x[k] + n[k] \right) \cdot \left( x[k-L] + n[k-L] \right)^{*}$

produces three noise terms:

$n[k] \cdot n^{*}\left[ k - \frac{N}{4} \right]; \quad x[k] \cdot n^{*}\left[ k - \frac{N}{4} \right]; \quad x^{*}\left[ k - \frac{N}{4} \right] \cdot n[k]$

When applying the suggested method, each multiplication is

$\left( x[k] + n[k] \right) \cdot g[k]$

and there is only one noise term, g[k]·n[k].

On the other hand, just half the frame is filled up.

For low SNR the term n[k]n*[k−L] is dominant.

The tolerance to white noise is about the same as for the Minn algorithm, with an advantage for Golay timing in low SNR (in which case the noise×noise term, present in Minn but not in Golay, becomes more pronounced). The reason the two methods have the same white noise tolerance in high SNR is that on one hand Minn presents two noise×signal terms (contributing equal noise powers), whereas Golay presents a single such term, but on the other hand Golay is sparse, with just half the frame being filled up, hence there is a factor of two less noise averaging in the Golay case, which offsets the fact that it comprises a single noise×signal term. In detail, the conjugate products arising in the Minn case are of the form

$\left( {{x\lbrack k\rbrack} + {n\lbrack k\rbrack}} \right) \cdot \left( {{x\left\lbrack {k + \frac{N}{4}} \right\rbrack} + {n\left\lbrack {k + \frac{N}{4}} \right\rbrack}} \right)^{*}$

which yields 3 noise terms (for each multiplication):

${{n\lbrack k\rbrack} \cdot {n\left\lbrack {k + \frac{N}{4}} \right\rbrack}^{*}};{{x\lbrack k\rbrack} \cdot {n\left\lbrack {k + \frac{N}{4}} \right\rbrack}^{*}};{{x\left\lbrack {k + \frac{N}{4}} \right\rbrack}^{*} \cdot {{n\lbrack k\rbrack}.}}$

Our proposed method uses an internal noise free sequence, g[n], thus one multiplication of the correlation process becomes (x[k]+n[k])g[k] yielding only one noise term: g[k]n[k].

As for tolerance to laser phase noise (LPN) (graph 216 of FIG. 18), it turns out that the system is relatively insensitive to the laser linewidth (the level of phase noise), even for large levels of LPN.

VI. Golay Based CFO Estimation

The active estimation of the CFO using the Golay based system is related to the preceding analysis of CFO degradation. We have seen that at the instant when the correlation peak occurs, the received time samples {−L/2−1, . . . , −1, 0, 1 . . . L/2−1} are perfectly aligned with the TS version stored in the taps of the cross-correlator. If there were no noise and distortion, then the received signal would equal the transmitted signal, i.e., the TS. The 2L non-zero taps of the FIR filter implementing the TS XCOR would then invariably generate (±1)(±1)=1. Thus we would obtain two records of all-ones, each of length L, corresponding to the two GCCs. However, if there is CFO, then the received TS over the support of L points corresponding to each GCC would consist of successive samples (±1) multiplied by the CFO time-domain phase-ramp factor e^(jθ_(CFO)k), where

θ_(CFO)=2πΔv _(CFO) T _(s)=2πΔv _(CFO) /f _(s)   (26)

thus, the received sequence would be (±1)e^(jθ_(CFO)k), where the signs correspond to the Golay codeword. When this sequence gets aligned with and multiplied by the Golay codeword stored in the taps of the TS-XCOR of the receiver, this yields

(±1)e^(jθ_(CFO)k)(±1)=e^(jθ_(CFO)k)   (27)

Thus, at the optimal timing instant, we ideally obtain a constant (unity in this example) amplitude sequence with a phase tilt. To determine the CFO, it remains to estimate the phase increment per sample, θ_(CFO), as then (26) may be used to extract

Δv _(CFO) =f _(s)(θ_(CFO)/2π)  (28)

Various methods may be used to extract the CFO phase increment, e.g. one may evaluate the phase angles of the TS-XCOR samples at the alignment time (L samples associated with each of the two respective Golay codewords, separated by L_(edg) samples), by inputting each complex sample into an angle extractor (e.g. the CORDIC algorithm) and “passing a line”, based on a least-squares solution, through the measured phases, after the raw measured phases have been unwrapped. An alternative preferred method is to divide each record of L points into S subsets of L/S points (where L/S is an integer). Each such subset is averaged (its samples are summed up and divided by L/S), the subsets are grouped in successive pairs, and the phase of the even subset is compared with that of the odd subset within each pair, i.e. the phase difference is extracted, e.g. by taking conjugate products and evaluating the phase angle, or alternatively by evaluating the phase angles and subtracting.

In the particular case that S=L, the “subsets” become singletons (single samples). In this case the phase of each complex even sample is subtracted from that of the odd sample paired with it, and the resulting phase differences are averaged. It is also possible to further average over the average phases corresponding to the two Golay codewords and, in a sub-banded filter-bank based context, to further average across sub-bands, as all sub-bands are assumed to be affected by a common CFO. This method requires long averaging and is relatively impervious to laser phase noise, as the subtraction of successive phases of the subgroups effectively whitens the phase noise, making it amenable to averaging. The longer the averaging window, or the more windows are averaged at the top level (averages of averages), the better the laser phase noise and white noise tolerance of the CFO estimator. Another measure which may improve laser phase noise tolerance in the sub-banded context is to use “non-redundant interleaving” of the sub-bands as described in the sequel of this patent application.
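The subset-averaging estimator can be sketched as follows for a single Golay record, assuming a noiseless phase ramp e^(jθ_(CFO)k) at the tap-multiplier outputs. Summing the conjugate products of all pairs before taking a single angle is one of the options mentioned above and is chosen here only for brevity; the function names and the sampling rate are hypothetical.

```python
import cmath
import math

def estimate_theta(z, S):
    """Estimate the per-sample phase increment from one record z of L samples.

    The record is split into S subsets of M = L/S samples, each subset is
    averaged, successive subsets are paired (0,1), (2,3), ..., and the phase
    difference inside each pair (M times the increment) is extracted from
    the accumulated conjugate products.
    """
    L = len(z)
    if S < 2 or S % 2 or L % S:
        raise ValueError("S must be even and divide the record length")
    M = L // S
    means = [sum(z[s * M:(s + 1) * M]) / M for s in range(S)]
    acc = sum(means[s].conjugate() * means[s + 1] for s in range(0, S - 1, 2))
    return cmath.phase(acc) / M

def cfo_from_theta(theta, f_s):
    """Eq. (28): CFO = f_s * theta / (2 pi)."""
    return f_s * theta / (2 * math.pi)

L, theta_true = 32, 0.02
z = [cmath.exp(1j * theta_true * k) for k in range(L)]   # aligned record with CFO ramp

theta_hat = estimate_theta(z, S=8)
print(theta_hat, cfo_from_theta(theta_hat, f_s=1.0e9))
```

The unambiguous range is |θ_(CFO)|·(L/S)<π, so a larger S (shorter subsets) widens the dynamic range at the cost of less averaging per subset; S=L reduces to the singleton case.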

Comparison with Delay&Correlate Methods Such as Minn

The operation principle of CFO estimation using D&C methods is by taking the angle of the correlation, from which the CFO can be estimated. Denoting s [n] as the complex received symbols, the correlation of

$\frac{N}{2}$

identical parts of the frame is given by:

$\begin{matrix} {\sum\limits_{n = 0}^{\frac{N}{2} - 1}{{s^{*}\lbrack n\rbrack} \cdot {s\left\lbrack {\frac{N}{2} + n} \right\rbrack}}} & (29) \end{matrix}$

Thanks to the conjugate operation the common phase of the entire frame is cancelled, and the phase rotation due to CFO remains. The phase rotation is proportional to the time difference between the two samples s[n] and

${s\left\lbrack {\frac{N}{2} + n} \right\rbrack},$

which is

$\frac{N}{2} \cdot T_{s}$

regardless of n (T_(s) is the receiver sample time). CFO estimation is possible by finding the phase of the vector

$\sum\limits_{n = 0}^{\frac{N}{2} - 1}{{s^{*}\lbrack n\rbrack} \cdot {s\left\lbrack {\frac{N}{2} + n} \right\rbrack}}$

in the I-Q plane:

$\begin{aligned} \angle\left( \sum_{n=0}^{\frac{N}{2}-1} s^{*}[n] \cdot s\left[ \frac{N}{2}+n \right] \right) &= \angle\left( \sum_{n=0}^{\frac{N}{2}-1} e^{j2\pi f_{CFO}\frac{N}{2}T_{S}} \right) \\ &= 2\pi f_{CFO}\frac{N}{2}T_{S} \end{aligned} \qquad (30)$

where the last equality is true in case there is no noise (f_(CFO) denotes the CFO). The extraction of the CFO is done by dividing equation (30) by the factor

$2\pi \frac{N}{2}{T_{S}.}$

Dividing by

$\frac{N}{2}$

has a beneficial outcome as it reduces the variance of the noise, yet it also reduces the dynamic range of the CFO estimation. The largest positive phase that can be detected is π, which leads to the following equality:

${2\pi \; f_{CFO}^{Max}\frac{N}{2}T_{S}} = {\left. \pi \Rightarrow f_{CFO}^{Max} \right. = {\frac{1}{N} \cdot \frac{1}{T_{S}}}}$

The larger N is, the smaller f_(CFO) ^(Max) becomes (smaller dynamic range). D&C methods have a fixed dynamic range.
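The fixed dynamic range of the D&C estimator can be seen in a short sketch. It assumes a noiseless frame of N unit-modulus symbols whose two halves are identical before the CFO is applied; the ±1 pattern, the sample time and the function names are arbitrary assumptions made for the example.

```python
import cmath
import math

def make_frame(N, f_cfo, T_s):
    """Frame with two identical halves, rotated by a CFO phase ramp."""
    half = N // 2
    base = [1 if (n * n) % 3 else -1 for n in range(half)]   # arbitrary +/-1 pattern
    return [base[n % half] * cmath.exp(2j * math.pi * f_cfo * n * T_s)
            for n in range(N)]

def dc_cfo_estimate(s, T_s):
    """Delay-and-correlate estimate: angle of sum s*[n] s[N/2 + n], scaled as in (30)."""
    half = len(s) // 2
    corr = sum(s[n].conjugate() * s[half + n] for n in range(half))
    return cmath.phase(corr) / (2 * math.pi * half * T_s)

N, T_s = 64, 1.0e-9
f_max = 1.0 / (N * T_s)                    # largest unambiguous CFO

inside = dc_cfo_estimate(make_frame(N, 5.0e6, T_s), T_s)
outside = dc_cfo_estimate(make_frame(N, 20.0e6, T_s), T_s)
print(f_max, inside, outside)
```

A CFO inside the range is recovered, whereas a CFO beyond f_(CFO) ^(Max) wraps around by 2/(N·T_(s)) and is reported with the wrong value, which is the dynamic-range limitation stated above.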

At the moment of perfect alignment between the transmitted training sequence and the internal training sequence, the multiplication of the two frames results only in the values of ±1. We can choose to use any set of these identical symbols, which implies a much more flexible dynamic range for the CFO estimation than in Minn's or Schmidl-Cox's methods. Controlling the number of symbols,

$\frac{N}{2}$

in (5), gives freedom in compromising between dynamic range and noise averaging.

We note that our timing correlation is less resistant to CFO than Minn's, since we correlate against an internal frame rather than correlating two halves of a received frame, yet it shows one distinct peak even for CFO values of 100 [MHz], as will be elaborated next.

Golay Based Methods of CFO Estimation

FIG. 19 describes two equivalent realizations of the phase detector (PD) building block, which is a module extracting the relative phase between its two input samples: a complex-domain PD 281′ and an angular-domain PD 282″.

FIGS. 11 and 20-32 are CFO estimation circuits 101-114 according to various embodiments of the invention. The circuits include a shift register 120, first Golay codeword taps 131, second Golay codeword taps 132, first phase detectors 181, second phase detectors and averaging circuits 190.

In FIGS. 11- and 23-24 the two Golay codewords are used to estimate the CFO separately and then the results are averaged.

-   L denotes the difference in the indexes of the two samples fed to the PDs.
-   AVG 190 denotes an (arithmetic) averaging module, namely sum and divide by the number of inputs.
-   The number of PDs required decreases with L, as the following condition holds: L+#PD=8. Here #PD is the number of PDs required per record.

Next, FIGS. 25-27 illustrate CFO estimation circuits in which the records of the Golay codewords G1 and G2 are combined together for the CFO estimation.

-   -   M denotes here the relative difference between the positions         (inside each record) of the samples fed into the PD. The         following condition holds: M+#PD total=9

FIGS. 28-32 include AVG prior to the PD.

-   The number of AVG-s can differ.
-   As in the first CFO estimation part, the number of PD-s can be different according to the separation between the AVG-ed sub-records.

These multiple embodiments demonstrate that the Golay based timing has large leeway in how to organize the extraction of the incremental phases. This is in contrast with Minn CFO estimation which is quite constrained to operate with strict separations for the PDs.

The various embodiments of FIGS. 20-32 represent various tradeoffs between ASE and laser phase noise performance, CFO dynamic range and realization complexities.

VII. Polarization-Diverse Operation

It suffices to launch the G-TS in just one state of polarization (say along the X or Y polarization axis) in the Tx, since the polarization transformation along the fiber implies that the powers and relative phases of the two received G-TS components in the two polarizations (POL) will be randomized anyway. To attain resilience of the timing to the POL evolution in the fiber, it is proposed to use a POL-diversity technique, whereby the G-TS launched in each sub-band is received at the filter-bank outputs of both POLs corresponding to that sub-band, and coherent combining of the two XCORs is performed in order to increase the SNR prior to determining the position of the peak (mainlobe).

The proposed novel timing offset estimation is manifested in a highly distinct correlation peak that is impervious to channel impairments, allows for CFO estimation over a large dynamic range, and outperforms previous methods in the clarity of the peak it displays and in its AWGN resilience. The implementation requires no multipliers, which is a benefit in both area and cost. To recap the advantages of the Golay based channel estimation algorithm:

-   extremely low HW complexity: just adders for timing, fewer multipliers for CFO
-   training sequence of the same sizes and duty cycles as Minn
-   much better mainlobe-to-sidelobes ratio than Minn: a sharp lone peak
-   can also cope with (and even estimate) fractional timing delays and be used with 2× oversampling for filter-bank receivers
-   tolerant to CFO of about 1-2% of the sampling rate
-   more flexible at estimating CFO than Minn and Schmidl-Cox, enabling a variety of tradeoffs between dynamic range, white noise and laser phase noise tolerances and complexity.

FIG. 33 illustrates method 400 according to an embodiment of the invention.

Method 400 may start by step 410 of receiving a stream of OFDM symbols.

Step 410 may be followed by step 420 of searching, by a timing circuit, in the stream of OFDM symbols, for a training sequence that comprises a first Golay codeword and a second Golay codeword.

The sum of an autocorrelation of the first Golay codeword and an autocorrelation of the second Golay codeword consists essentially of a delta function. See, for example, FIG. 1. Step 420 may be followed by step 430 of processing, by a timing circuit, the training sequence and extracting timing information about a timing of reception of OFDM symbols, out of the stream of OFDM symbols, that convey data.

Step 420 may be followed by step 440 of calculating frequency offset.

Reference Golay Sequences

To estimate timing we send two complementary Golay sequences. Golay sequences can be built recursively in the following way:

$A_{n} = \left\lbrack A_{n - 1}\ \ B_{n - 1} \right\rbrack$ $B_{n} = \left\lbrack A_{n - 1}\ \ {- B_{n - 1}} \right\rbrack$

For example:

A₁=1, B₁=−1

A₂=[1 −1], B₂=[1 1]

A₃=[1 −1 1 1], B₃=[1 −1 −1 −1]
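The recursion above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `golay_pair` and `autocorr_sum` are hypothetical, NumPy is assumed to be available, and the indexing assumes the recursion starts from the length-1 pair A₁=1, B₁=−1, so that the pair with index n has length 2^(n−1). The sketch also checks the complementary property, namely that the two aperiodic autocorrelations sum to a single peak of height twice the sequence length with zero sidelobes.

```python
import numpy as np

def golay_pair(n):
    """Return (A_n, B_n) built by A_n = [A_{n-1} B_{n-1}], B_n = [A_{n-1} -B_{n-1}].

    Assumes the recursion starts from A_1 = [1], B_1 = [-1].
    """
    a = np.array([1])
    b = np.array([-1])
    for _ in range(n - 1):
        # Both new sequences are built from the previous pair.
        a, b = np.concatenate([a, b]), np.concatenate([a, -b])
    return a, b

def autocorr_sum(a, b):
    """Sum of the aperiodic autocorrelations of a and b (all lags)."""
    return np.correlate(a, a, mode="full") + np.correlate(b, b, mode="full")

a3, b3 = golay_pair(3)
print(a3.tolist(), b3.tolist())  # [1, -1, 1, 1] [1, -1, -1, -1]

# Length-32 pair (index 6 under the indexing assumed above).
a32, b32 = golay_pair(6)
s = autocorr_sum(a32, b32)
print(s[len(a32) - 1], np.count_nonzero(s))  # 64 1
```

The zero-lag value sits at index 31 of the 63-point result; every other lag cancels exactly because the sidelobes of the two autocorrelations have opposite signs.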

Transmitted Signal

We send the following training frame over all sub-bands of the X polarization: [A₃₂B₃₂].

We send the following training frame over all sub-bands of the Y polarization: [A₃₂−B₃₂].

Correlation Operation

During channel propagation the X polarization and the Y polarization are mixed. This is why we must deal with both polarizations to get a reliable timing estimate. FIG. 34 illustrates receiver structure 500 according to an embodiment of the invention.

The receiver includes square absolute value unit 502, running average module 504, multipliers 510, first adder 521, second adder 522, first square absolute value unit 531, second square absolute value unit 532, adder 540 and peak search unit 550.

Let's define the following values:

$A_{corr}\bullet {\sum\limits_{i = 1}^{32}{{A_{32}\lbrack i\rbrack} \cdot {d\left\lbrack {2i} \right\rbrack}}}$ $B_{corr}\bullet {\sum\limits_{i = 1}^{32}{{B_{32}\lbrack i\rbrack} \cdot {d\left\lbrack {64 + {2i}} \right\rbrack}}}$

To achieve the perfect autocorrelation property of the Golay sequences, we must add the correlations over the two different Golay sequences:

$C_{X} \triangleq A_{corr} + B_{corr}$

$C_{Y} \triangleq A_{corr} - B_{corr}$

In a real system, however, the signals of the different polarizations are multiplexed together. We would have to perform correlation with both the X-pol sequence and the Y-pol sequence at each receiver polarization. To reduce complexity we propose the following overall cost function:

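$Corr \triangleq C_{X}^{2} + C_{Y}^{2} = \left( A_{corr} + B_{corr} \right)^{2} + \left( A_{corr} - B_{corr} \right)^{2} = 2\left( A_{corr}^{2} + B_{corr}^{2} \right) \propto A_{corr}^{2} + B_{corr}^{2}$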

From the above equation it is evident that we need only a single operation to get a reliable correlation value for any polarization rotation angle.
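The insensitivity of this cost function to the polarization mixing can be checked numerically. The following Python sketch is illustrative only and rests on several assumptions that are not taken from the text: the channel is modeled as a noiseless mix of the X frame and the Y frame with a rotation angle `theta` and a relative phase `phase`, the symbols are placed on the even samples of a 2× oversampled stream with zeros in between, squared magnitudes are used because the samples are complex, and all function names are hypothetical.

```python
import numpy as np

def golay_pair(length):
    """Complementary pair of the given power-of-two length (recursive build)."""
    a, b = np.array([1.0]), np.array([-1.0])
    while a.size < length:
        a, b = np.concatenate([a, b]), np.concatenate([a, -b])
    return a, b

A, B = golay_pair(32)
N = A.size

def received(theta, phase, start, total=400):
    """One received polarization: a mix of the X frame [A B] and the Y frame [A -B]."""
    x = np.zeros(4 * N)
    y = np.zeros(4 * N)
    x[::2] = np.concatenate([A, B])   # assumption: symbols sit on even samples
    y[::2] = np.concatenate([A, -B])
    d = np.zeros(total, dtype=complex)
    mix = np.cos(theta) * x + np.exp(1j * phase) * np.sin(theta) * y
    d[start:start + 4 * N] = mix
    return d

def timing_metric(d, k):
    """|A_corr|^2 + |B_corr|^2 for a candidate frame start at sample k."""
    idx = k + 2 * np.arange(N)            # every second sample, as in d[2i]
    a_corr = np.sum(A * d[idx])
    b_corr = np.sum(B * d[idx + 2 * N])   # second codeword starts 64 samples later
    return abs(a_corr) ** 2 + abs(b_corr) ** 2

def find_peak(theta, phase=0.7, start=100):
    """Location and height of the metric peak for one polarization state."""
    d = received(theta, phase, start)
    metric = np.array([timing_metric(d, k) for k in range(d.size - 4 * N)])
    return int(metric.argmax()), float(metric.max())

for theta in (0.0, 0.3, np.pi / 4, 1.2):
    print(round(theta, 3), find_peak(theta))
```

Under these assumptions the peak is found at the true frame start (sample 100) for every angle tried, and its height stays at 2·32² = 2048, since the X-frame and Y-frame contributions add in A_corr and subtract in B_corr.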

Peak Search Block

Graph 610 of FIG. 35 presents the expected correlation results (different methods are presented for comparison).

As can be seen, the Golay-based (PN sequence) timing estimation gives a much clearer result than the previously implemented Minn timing method.

Graphs 620, 630, 640 and 650 of FIGS. 36, 37, 38 and 39 also illustrate simulation results: without fractional delay, with a small positive fractional delay, with a half-sample fractional delay, and with a small negative fractional delay, respectively. When the peak of the resulting correlation function is examined closely, different scenarios can be identified depending on the overall fractional (sub-sample) delay of the channel.

The peak detection mechanism should give a reliable peak reading in all of these cases. The following algorithm is proposed: choose the first peak higher than a given threshold, and

The threshold value should be adaptive and be determined according to the average power of the input data signal.
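A minimal Python sketch of such a peak search follows. It covers only what is stated above: the first peak that exceeds a threshold, with the threshold derived from the average input power. The function name `first_peak_above_threshold`, the scale factor `scale`, the averaging length `window`, and the definition of a peak as a local maximum are all assumptions made for illustration.

```python
import numpy as np

def first_peak_above_threshold(corr, power, scale=0.5, window=64):
    """Return the index of the first local maximum of corr above the threshold.

    The threshold at each sample is scale * (running average of the input
    power over `window` samples). Returns None if no such peak exists.
    Both `scale` and `window` are hypothetical tuning parameters.
    """
    corr = np.asarray(corr, dtype=float)
    power = np.asarray(power, dtype=float)
    kernel = np.ones(window) / window
    avg_power = np.convolve(power, kernel, mode="same")  # running average
    threshold = scale * avg_power
    for k in range(1, corr.size - 1):
        is_local_max = corr[k] >= corr[k - 1] and corr[k] > corr[k + 1]
        if is_local_max and corr[k] > threshold[k]:
            return k
    return None

corr = [0, 1, 0.5, 0, 3, 10, 4, 0, 9, 0]
power = np.ones(10)
print(first_peak_above_threshold(corr, power, scale=2.0, window=4))  # 5
```

In this toy input the small bump at index 1 stays below the threshold, the first qualifying local maximum is at index 5, and the later peak at index 8 is ignored because the search stops at the first hit.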

Moreover, the terms “front,” “back,” “top,” “bottom,” “over,” “under” and the like in the description and in the claims, if any, are used for descriptive purposes and not necessarily for describing permanent relative positions. It is understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments of the invention described herein are, for example, capable of operation in other orientations than those illustrated or otherwise described herein.

Those skilled in the art will recognize that the boundaries between logic blocks are merely illustrative and that alternative embodiments may merge logic blocks or circuit elements or impose an alternate decomposition of functionality upon various logic blocks or circuit elements. Thus, it is to be understood that the architectures depicted herein are merely exemplary, and that in fact many other architectures may be implemented which achieve the same functionality.

Any arrangement of components to achieve the same functionality is effectively “associated” such that the desired functionality is achieved. Hence, any two components herein combined to achieve a particular functionality may be seen as “associated with” each other such that the desired functionality is achieved, irrespective of architectures or intermediate components. Likewise, any two components so associated can also be viewed as being “operably connected,” or “operably coupled,” to each other to achieve the desired functionality.

Furthermore, those skilled in the art will recognize that boundaries between the above described operations are merely illustrative. The multiple operations may be combined into a single operation, a single operation may be distributed in additional operations and operations may be executed at least partially overlapping in time. Moreover, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be altered in various other embodiments.

Also for example, in one embodiment, the illustrated examples may be implemented as circuitry located on a single integrated circuit or within a same device. Alternatively, the examples may be implemented as any number of separate integrated circuits or separate devices interconnected with each other in a suitable manner.

However, other modifications, variations and alternatives are also possible. The specifications and drawings are, accordingly, to be regarded in an illustrative rather than in a restrictive sense.

In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word ‘comprising’ does not exclude the presence of other elements or steps than those listed in a claim. Furthermore, the terms “a” or “an,” as used herein, are defined as one or more than one. Also, the use of introductory phrases such as “at least one” and “one or more” in the claims should not be construed to imply that the introduction of another claim element by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim element to inventions containing only one such element, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an.” The same holds true for the use of definite articles. Unless stated otherwise, terms such as “first” and “second” are used to arbitrarily distinguish between the elements such terms describe. Thus, these terms are not necessarily intended to indicate temporal or other prioritization of such elements. The mere fact that certain measures are recited in mutually different claims does not indicate that a combination of these measures cannot be used to advantage.

While certain features of the invention have been illustrated and described herein, many modifications, substitutions, changes, and equivalents will now occur to those of ordinary skill in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the true spirit of the invention. 

We claim:
 1. An orthogonal frequency division multiplexing (OFDM) receiver, comprising: an input port that is configured to receive a stream of OFDM symbols; a timing circuit that is configured to search, in the stream of OFDM symbols, for a training sequence that comprises a first Golay codeword and a second Golay codeword and to process the training sequence and extract timing information about a timing of reception of OFDM symbols, out of the stream of OFDM symbols, that convey data; wherein the sum of an autocorrelation of the first Golay codeword and an autocorrelation of the second Golay codeword consists essentially of a delta function.
 2. The OFDM receiver according to claim 1 wherein the first Golay codeword and the second Golay codeword are separated from each other by at least one padding bit.
 3. The OFDM receiver according to claim 1 wherein the first Golay codeword and the second Golay codeword are not separated from each other by any padding bits.
 4. The OFDM receiver according to claim 1 wherein at least ninety percent of energy of the sum belong to the delta function.
 5. The OFDM receiver according to claim 1 wherein the sum consists only of the delta function.
 6. The OFDM receiver according to claim 1 wherein the delta function has a peak that equals twice a length of the first Golay codeword.
 7. The OFDM receiver according to claim 1 wherein the timing circuit does not comprise multiplication units.
 8. The OFDM receiver according to claim 1 wherein the OFDM symbol stream comprises multiple interleaved sequences of oversampled data symbols; wherein each sequence of oversampled data symbols comprises a training sequence candidate; wherein the timing circuit is configured to select a selected training sequence out of multiple training sequence candidates of the OFDM sequence stream.
 9. The OFDM receiver according to claim 7 wherein the timing circuit is configured to calculate cross-correlations peaks by cross correlating between each of the multiple training sequence candidate and a reference training sequence that comprises the first Golay codeword and the second Golay codeword.
 10. The OFDM receiver according to claim 8 wherein the timing circuit is configured to select the selected training sequence in response to the cross-correlation peaks.
 11. The OFDM receiver according to claim 8 wherein the timing circuit is configured to select as the selected training sequence a selected training sequence candidate having a biggest cross correlation peak out of the cross correlation peaks.
 12. The OFDM receiver according to claim 8 wherein the timing circuit is configured to define a timing reference point as a location of the cross correlation peak of the selected training sequence.
 13. The OFDM receiver according to claim 8 wherein the timing circuit is configured to compare the cross correlation peak of the selected training sequence to a cross correlation peak of at least one training sequence candidate that differs from the selected training sequence to provide a comparison result; and to determine a fractional timing offset based upon the comparison.
 14. The OFDM receiver according to claim 7 wherein the OFDM receiver comprises a frequency offset determination circuit that is configured to determine a frequency offset of the OFDM sequence in response to a value of the cross correlation peak of the selected training sequence.
 15. The OFDM receiver according to claim 14 wherein the frequency offset determination circuit is configured to: divide the first Golay codeword of the selected training sequence into multiple first subsets; calculate first averages of cross correlations between the multiple first subsets and a corresponding reference First Golay codeword subsets; divide the second Golay codeword of the selected training sequence into multiple second subsets; calculate second averages of cross correlations between the multiple second subsets and a corresponding reference Second Golay codeword subsets; extract phase difference between first averages and corresponding averages; and determine the frequency offset of the OFDM sequence in response to the phase differences.
 16. The OFDM receiver according to claim 14 wherein the timing circuit comprises: a first cross-correlation circuit that comprises first taps and is configured to search for the first Golay codeword; a second cross-correlation circuit that comprises second taps and is configured to search for the second Golay codeword; wherein the frequency offset determination circuit comprises multiple phase detectors that are configured to calculate phase differences between output signals of different taps of the first taps and the second taps; wherein the frequency offset determination circuit is configured to determine the frequency offset of the OFDM sequence in response to the phase differences.
 17. The OFDM receiver according to claim 16 wherein the multiple phase detectors comprise first phase detectors that are configured to calculate phase differences between output signals of different first taps; second phase detectors that are configured to calculate phase differences between output signals of different second taps.
 18. The OFDM receiver according to claim 16 wherein the multiple phase detectors comprise a phase detector that is configured to calculate a phase difference between a first output signal of a first tap and a second output signal of a second tap.
 19. A method for receiving and processing orthogonal frequency division multiplexing (OFDM) signals, the method comprises: receiving a stream of OFDM symbols; searching, by a timing circuit, in the stream of OFDM symbols, for a training sequence that comprises a first Golay codeword and a second Golay codeword; processing, by a timing circuit, the training sequence and extracting timing information about a timing of reception of OFDM symbols, out of the stream of OFDM symbols, that convey data; wherein a sum of an autocorrelation of the first Golay codeword and an autocorrelation of the second Golay codeword consists essentially of a delta function. 